Overview

Brought to you by YData

Dataset statistics

Number of variables: 14
Number of observations: 47619
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 44
Duplicate rows (%): 0.1%
Total size in memory: 5.4 MiB
Average record size in memory: 120.0 B
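
The summary figures above can be cross-checked with a few lines of arithmetic. The sketch below recomputes the duplicate-row percentage and the total memory footprint from the reported row count and average record size. The variable names are illustrative, and the conversion 1 MiB = 1024 * 1024 bytes is an assumption about how the report rounds its sizes.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_rows = 47619
n_duplicates = 44
avg_record_bytes = 120.0

# Share of duplicate rows, as a percentage rounded to one decimal place.
duplicate_pct = round(100 * n_duplicates / n_rows, 1)

# Total footprint implied by the average record size, in MiB
# (assumption: 1 MiB = 1024 * 1024 bytes).
total_mib = round(n_rows * avg_record_bytes / (1024 * 1024), 1)

print(duplicate_pct, total_mib)
```

Both values agree with the reported 0.1% and 5.4 MiB after rounding.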

Variable types

Numeric: 11
Categorical: 3

Alerts

Dataset has 44 (0.1%) duplicate rows [Duplicates]
gender is highly overall correlated with relationship [High correlation]
relationship is highly overall correlated with gender [High correlation]
race is highly imbalanced (65.9%) [Imbalance]
workclass has 1426 (3.0%) zeros [Zeros]
marital-status has 6563 (13.8%) zeros [Zeros]
occupation has 5559 (11.7%) zeros [Zeros]
relationship has 19172 (40.3%) zeros [Zeros]
capital-gain has 43674 (91.7%) zeros [Zeros]
capital-loss has 45379 (95.3%) zeros [Zeros]
native-country has 824 (1.7%) zeros [Zeros]
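
The 65.9% in the race imbalance alert appears consistent with an entropy-based score: one minus the Shannon entropy of the class frequencies divided by its maximum, log2(k) for k classes. The sketch below is an assumption about how such a score can be derived, not a quotation of the profiler's code. The function name `imbalance_score` is hypothetical, and the counts are the race category counts listed later in this report.

```python
import math

# Category counts for `race`, copied from this report.
race_counts = [40739, 4588, 1464, 461, 367]


def imbalance_score(counts):
    """Return 1 - H / log2(k).

    The score is 0 for a perfectly uniform column and approaches 1 as a
    single class dominates. A column with fewer than two classes is scored
    0 here by convention (an assumption of this sketch).
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


score = imbalance_score(race_counts)
print(f"{score:.1%}")
```

Under this definition the score for race rounds to the figure shown in the alert, which suggests the alert reflects the dominance of value 4 (85.6% of rows).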

Reproduction

Analysis started: 2025-07-12 08:02:38.185642
Analysis finished: 2025-07-12 08:02:55.458780
Duration: 17.27 seconds
Software version: ydata-profiling v4.16.1
Download configuration: config.json

Variables

age
Real number (ℝ)

Distinct: 59
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.230664
Minimum: 17
Maximum: 75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 KiB

Quantile statistics

Minimum: 17
5-th percentile: 19
Q1: 28
Median: 37
Q3: 47
95-th percentile: 62
Maximum: 75
Range: 58
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 13.199351
Coefficient of variation (CV): 0.3452556
Kurtosis: -0.54604325
Mean: 38.230664
Median Absolute Deviation (MAD): 10
Skewness: 0.4425353
Sum: 1820506
Variance: 174.22286
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
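
Several of the age statistics above are derived from others, and the relationships hold for every numeric variable in this report. The sketch below recomputes the coefficient of variation, variance, range and interquartile range from the reported mean, standard deviation and quantiles. The variable names are illustrative only.

```python
# Values copied from the `age` statistics in this report.
mean = 38.230664
std = 13.199351
minimum, q1, q3, maximum = 17, 28, 47, 75

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(round(cv, 7), round(variance, 5), value_range, iqr)
```

The results match the reported CV, variance, range and IQR up to the rounding of the inputs.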

Common values

Value  Count  Frequency (%)
36      1330  2.8%
35      1328  2.8%
33      1312  2.8%
23      1307  2.7%
31      1303  2.7%
34      1294  2.7%
30      1268  2.7%
28      1264  2.7%
37      1254  2.6%
38      1252  2.6%
Other values (49): 34707 (72.9%)

Smallest values

Value  Count  Frequency (%)
17       590  1.2%
18       853  1.8%
19      1041  2.2%
20      1101  2.3%
21      1081  2.3%
22      1161  2.4%
23      1307  2.7%
24      1190  2.5%
25      1176  2.5%
26      1138  2.4%

Largest values

Value  Count  Frequency (%)
75        68  0.1%
74        70  0.1%
73       102  0.2%
72       113  0.2%
71       111  0.2%
70       130  0.3%
69       144  0.3%
68       170  0.4%
67       232  0.5%
66       231  0.5%

workclass
Real number (ℝ)

Zeros 

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.041391
Minimum: 0
Maximum: 6
Zeros: 1426
Zeros (%): 3.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 3
Q3: 3
95-th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.1423469
Coefficient of variation (CV): 0.37560015
Kurtosis: 1.7659471
Mean: 3.041391
Median Absolute Deviation (MAD): 0
Skewness: 0.15933069
Sum: 144828
Variance: 1.3049565
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common values

Value  Count  Frequency (%)
3      33089  69.5%
5       3744  7.9%
1       3089  6.5%
2       2639  5.5%
6       1973  4.1%
4       1659  3.5%
0       1426  3.0%

Smallest values

Value  Count  Frequency (%)
0       1426  3.0%
1       3089  6.5%
2       2639  5.5%
3      33089  69.5%
4       1659  3.5%
5       3744  7.9%
6       1973  4.1%

Largest values

Value  Count  Frequency (%)
6       1973  4.1%
5       3744  7.9%
4       1659  3.5%
3      33089  69.5%
2       2639  5.5%
1       3089  6.5%
0       1426  3.0%

fnlwgt
Real number (ℝ)

Distinct: 27805
Distinct (%): 58.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 189143
Minimum: 12285
Maximum: 1490400
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 KiB

Quantile statistics

Minimum: 12285
5-th percentile: 39478
Q1: 117359.5
Median: 177858
Q3: 236696
95-th percentile: 378036
Maximum: 1490400
Range: 1478115
Interquartile range (IQR): 119336.5

Descriptive statistics

Standard deviation: 105421.76
Coefficient of variation (CV): 0.5573654
Kurtosis: 6.2203837
Mean: 189143
Median Absolute Deviation (MAD): 60069
Skewness: 1.4540404
Sum: 9.0068003 × 10^9
Variance: 1.1113748 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value   Count  Frequency (%)
203488     21  < 0.1%
120277     19  < 0.1%
190290     19  < 0.1%
125892     18  < 0.1%
126569     18  < 0.1%
126675     17  < 0.1%
99185      17  < 0.1%
113364     17  < 0.1%
186934     16  < 0.1%
111567     16  < 0.1%
Other values (27795): 47441 (99.6%)

Smallest values

Value  Count  Frequency (%)
12285      1  < 0.1%
13492      1  < 0.1%
13769      3  < 0.1%
13862      1  < 0.1%
14878      1  < 0.1%
18827      1  < 0.1%
19214      1  < 0.1%
19302      6  < 0.1%
19395      2  < 0.1%
19410      2  < 0.1%

Largest values

Value    Count  Frequency (%)
1490400      1  < 0.1%
1484705      1  < 0.1%
1455435      1  < 0.1%
1366120      1  < 0.1%
1268339      1  < 0.1%
1226583      1  < 0.1%
1210504      1  < 0.1%
1184622      1  < 0.1%
1161363      1  < 0.1%
1125613      1  < 0.1%

educational-num
Real number (ℝ)

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.217602
Minimum: 4
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 KiB

Quantile statistics

Minimum: 4
5-th percentile: 6
Q1: 9
Median: 10
Q3: 13
95-th percentile: 14
Maximum: 16
Range: 12
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.3776774
Coefficient of variation (CV): 0.23270405
Kurtosis: 0.083528899
Mean: 10.217602
Median Absolute Deviation (MAD): 1
Skewness: 0.011172643
Sum: 486552
Variance: 5.6533497
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)

Common values

Value  Count  Frequency (%)
9      15655  32.9%
10     10824  22.7%
13      7983  16.8%
14      2634  5.5%
11      2053  4.3%
7       1801  3.8%
12      1592  3.3%
6       1373  2.9%
4        899  1.9%
15       819  1.7%
Other values (3): 1986 (4.2%)

Smallest values

Value  Count  Frequency (%)
4        899  1.9%
5        745  1.6%
6       1373  2.9%
7       1801  3.8%
8        654  1.4%
9      15655  32.9%
10     10824  22.7%
11      2053  4.3%
12      1592  3.3%
13      7983  16.8%

Largest values

Value  Count  Frequency (%)
16       587  1.2%
15       819  1.7%
14      2634  5.5%
13      7983  16.8%
12      1592  3.3%
11      2053  4.3%
10     10824  22.7%
9      15655  32.9%
8        654  1.4%
7       1801  3.8%

marital-status
Real number (ℝ)

Zeros 

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.6080766
Minimum: 0
Maximum: 6
Zeros: 6563
Zeros (%): 13.8%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 2
Q3: 4
95-th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.5032867
Coefficient of variation (CV): 0.57639668
Kurtosis: -0.55800219
Mean: 2.6080766
Median Absolute Deviation (MAD): 2
Skewness: -0.035057734
Sum: 124194
Variance: 2.2598709
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common values

Value  Count  Frequency (%)
2      21768  45.7%
4      15850  33.3%
0       6563  13.8%
5       1481  3.1%
6       1352  2.8%
3        568  1.2%
1         37  0.1%

Smallest values

Value  Count  Frequency (%)
0       6563  13.8%
1         37  0.1%
2      21768  45.7%
3        568  1.2%
4      15850  33.3%
5       1481  3.1%
6       1352  2.8%

Largest values

Value  Count  Frequency (%)
6       1352  2.8%
5       1481  3.1%
4      15850  33.3%
3        568  1.2%
2      21768  45.7%
1         37  0.1%
0       6563  13.8%

occupation
Real number (ℝ)

Zeros 

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.4396984
Minimum: 0
Maximum: 14
Zeros: 5559
Zeros (%): 11.7%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3
Median: 7
Q3: 10
95-th percentile: 13
Maximum: 14
Range: 14
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.3514722
Coefficient of variation (CV): 0.67572608
Kurtosis: -1.2661807
Mean: 6.4396984
Median Absolute Deviation (MAD): 4
Skewness: 0.1109995
Sum: 306652
Variance: 18.93531
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=15)

Common values

Value  Count  Frequency (%)
10      6129  12.9%
3       6024  12.7%
2       5990  12.6%
0       5559  11.7%
12      5442  11.4%
7       4715  9.9%
6       2869  6.0%
8       2639  5.5%
14      2294  4.8%
5       1975  4.1%
Other values (5): 3983 (8.4%)

Smallest values

Value  Count  Frequency (%)
0       5559  11.7%
1         15  < 0.1%
2       5990  12.6%
3       6024  12.7%
4       1353  2.8%
5       1975  4.1%
6       2869  6.0%
7       4715  9.9%
8       2639  5.5%
9        198  0.4%

Largest values

Value  Count  Frequency (%)
14      2294  4.8%
13      1444  3.0%
12      5442  11.4%
11       973  2.0%
10      6129  12.9%
9        198  0.4%
8       2639  5.5%
7       4715  9.9%
6       2869  6.0%
5       1975  4.1%

relationship
Real number (ℝ)

High correlation  Zeros 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.4501355
Minimum: 0
Maximum: 5
Zeros: 19172
Zeros (%): 40.3%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 3
95-th percentile: 4
Maximum: 5
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.6049031
Coefficient of variation (CV): 1.1067263
Kurtosis: -0.77177702
Mean: 1.4501355
Median Absolute Deviation (MAD): 1
Skewness: 0.78236617
Sum: 69054
Variance: 2.5757139
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Common values

Value  Count  Frequency (%)
0      19172  40.3%
1      12229  25.7%
3       7520  15.8%
4       4998  10.5%
5       2291  4.8%
2       1409  3.0%

Smallest values

Value  Count  Frequency (%)
0      19172  40.3%
1      12229  25.7%
2       1409  3.0%
3       7520  15.8%
4       4998  10.5%
5       2291  4.8%

Largest values

Value  Count  Frequency (%)
5       2291  4.8%
4       4998  10.5%
3       7520  15.8%
2       1409  3.0%
1      12229  25.7%
0      19172  40.3%

race
Categorical

Imbalance 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 744.0 KiB

Value  Count
4      40739
2       4588
1       1464
0        461
3        367

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 47619
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 4
3rd row: 4
4th row: 2
5th row: 4

Common Values

Value  Count  Frequency (%)
4      40739  85.6%
2       4588  9.6%
1       1464  3.1%
0        461  1.0%
3        367  0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value  Count  Frequency (%)
4      40739  85.6%
2       4588  9.6%
1       1464  3.1%
0        461  1.0%
3        367  0.8%

Most occurring characters

Value  Count  Frequency (%)
4      40739  85.6%
2       4588  9.6%
1       1464  3.1%
0        461  1.0%
3        367  0.8%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  47619  100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
4      40739  85.6%
2       4588  9.6%
1       1464  3.1%
0        461  1.0%
3        367  0.8%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  47619  100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
4      40739  85.6%
2       4588  9.6%
1       1464  3.1%
0        461  1.0%
3        367  0.8%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  47619  100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
4      40739  85.6%
2       4588  9.6%
1       1464  3.1%
0        461  1.0%
3        367  0.8%

gender
Categorical

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 744.0 KiB

Value  Count
1      31759
0      15860

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 47619
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
1      31759  66.7%
0      15860  33.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value  Count  Frequency (%)
1      31759  66.7%
0      15860  33.3%

Most occurring characters

Value  Count  Frequency (%)
1      31759  66.7%
0      15860  33.3%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  47619  100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
1      31759  66.7%
0      15860  33.3%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  47619  100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
1      31759  66.7%
0      15860  33.3%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  47619  100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
1      31759  66.7%
0      15860  33.3%

capital-gain
Real number (ℝ)

Zeros 

Distinct: 122
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1088.8692
Minimum: 0
Maximum: 99999
Zeros: 43674
Zeros (%): 91.7%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 5013
Maximum: 99999
Range: 99999
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 7495.3827
Coefficient of variation (CV): 6.8836392
Kurtosis: 151.0297
Mean: 1088.8692
Median Absolute Deviation (MAD): 0
Skewness: 11.83411
Sum: 51850862
Variance: 56180761
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
0      43674  91.7%
15024    513  1.1%
7688     408  0.9%
7298     361  0.8%
99999    241  0.5%
3103     151  0.3%
5178     145  0.3%
5013     117  0.2%
4386     107  0.2%
8614      82  0.2%
Other values (112): 1820 (3.8%)

Smallest values

Value  Count  Frequency (%)
0      43674  91.7%
114        8  < 0.1%
401        3  < 0.1%
594       51  0.1%
914       10  < 0.1%
991        5  < 0.1%
1055      37  0.1%
1086       5  < 0.1%
1111       1  < 0.1%
1151      13  < 0.1%

Largest values

Value  Count  Frequency (%)
99999    241  0.5%
41310      2  < 0.1%
34095      6  < 0.1%
27828     58  0.1%
25236     14  < 0.1%
25124      6  < 0.1%
22040      1  < 0.1%
20051     41  0.1%
18481      1  < 0.1%
15831      8  < 0.1%

capital-loss
Real number (ℝ)

Zeros 

Distinct: 99
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 87.912619
Minimum: 0
Maximum: 4356
Zeros: 45379
Zeros (%): 95.3%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 4356
Range: 4356
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 403.19364
Coefficient of variation (CV): 4.5862999
Kurtosis: 19.542084
Mean: 87.912619
Median Absolute Deviation (MAD): 0
Skewness: 4.534775
Sum: 4186311
Variance: 162565.11
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
0      45379  95.3%
1902     303  0.6%
1977     253  0.5%
1887     231  0.5%
2415      72  0.2%
1485      71  0.1%
1848      67  0.1%
1590      62  0.1%
1602      61  0.1%
1740      58  0.1%
Other values (89): 1062 (2.2%)

Smallest values

Value  Count  Frequency (%)
0      45379  95.3%
155        1  < 0.1%
213        5  < 0.1%
323        5  < 0.1%
419        3  < 0.1%
625       17  < 0.1%
653        4  < 0.1%
810        2  < 0.1%
880        6  < 0.1%
974        2  < 0.1%

Largest values

Value  Count  Frequency (%)
4356       1  < 0.1%
3900       2  < 0.1%
3770       4  < 0.1%
3683       2  < 0.1%
3175       2  < 0.1%
3004       5  < 0.1%
2824      14  < 0.1%
2754       2  < 0.1%
2603       4  < 0.1%
2559      17  < 0.1%

hours-per-week
Real number (ℝ)

Distinct: 96
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.564733
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 18
Q1: 40
Median: 40
Q3: 45
95-th percentile: 60
Maximum: 99
Range: 98
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 12.304123
Coefficient of variation (CV): 0.30332069
Kurtosis: 3.0028944
Mean: 40.564733
Median Absolute Deviation (MAD): 3
Skewness: 0.26442778
Sum: 1931652
Variance: 151.39143
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
40     22243  46.7%
50      4201  8.8%
45      2679  5.6%
60      2160  4.5%
35      1881  4.0%
20      1781  3.7%
30      1646  3.5%
55      1037  2.2%
25       921  1.9%
48       759  1.6%
Other values (86): 8311 (17.5%)

Smallest values

Value  Count  Frequency (%)
1         23  < 0.1%
2         43  0.1%
3         49  0.1%
4         75  0.2%
5         84  0.2%
6         82  0.2%
7         41  0.1%
8        202  0.4%
9         27  0.1%
10       396  0.8%

Largest values

Value  Count  Frequency (%)
99       134  0.3%
98        14  < 0.1%
97         2  < 0.1%
96         7  < 0.1%
95         2  < 0.1%
94         1  < 0.1%
92         3  < 0.1%
91         3  < 0.1%
90        42  0.1%
89         3  < 0.1%

native-country
Real number (ℝ)

Zeros 

Distinct: 42
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.928159
Minimum: 0
Maximum: 41
Zeros: 824
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 20
Q1: 39
Median: 39
Q3: 39
95-th percentile: 39
Maximum: 41
Range: 41
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 7.5771962
Coefficient of variation (CV): 0.20518749
Kurtosis: 14.203959
Mean: 36.928159
Median Absolute Deviation (MAD): 0
Skewness: -3.8834254
Sum: 1758482
Variance: 57.413903
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=42)

Common values

Value  Count  Frequency (%)
39     43234  90.8%
0        824  1.7%
26       601  1.3%
30       276  0.6%
11       206  0.4%
2        177  0.4%
33       170  0.4%
19       150  0.3%
5        123  0.3%
9        122  0.3%
Other values (32): 1736 (3.6%)

Smallest values

Value  Count  Frequency (%)
0        824  1.7%
1         26  0.1%
2        177  0.4%
3        116  0.2%
4         81  0.2%
5        123  0.3%
6         85  0.2%
7         41  0.1%
8        105  0.2%
9        122  0.3%

Largest values

Value  Count  Frequency (%)
41        22  < 0.1%
40        80  0.2%
39     43234  90.8%
38        26  0.1%
37        29  0.1%
36        65  0.1%
35       111  0.2%
34        20  < 0.1%
33       170  0.4%
32        55  0.1%

income
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 744.0 KiB

Value  Count
<=50K  36029
>50K   11590

Length

Max length: 5
Median length: 5
Mean length: 4.7566098
Min length: 4
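
The mean length of 4.7566098 follows directly from the two category counts, since each label has a fixed number of characters. The sketch below recomputes it and also reproduces the total of 226505 characters reported for this variable. The dictionary name `income_counts` is illustrative.

```python
# Category counts for `income`, copied from this report.
income_counts = {"<=50K": 36029, ">50K": 11590}

# Number of rows and total number of characters across all rows.
n_rows = sum(income_counts.values())
total_chars = sum(len(label) * n for label, n in income_counts.items())

# Mean label length is total characters divided by number of rows.
mean_length = total_chars / n_rows

print(n_rows, total_chars, round(mean_length, 7))
```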

Characters and Unicode

Total characters: 226505
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <=50K
2nd row: <=50K
3rd row: >50K
4th row: >50K
5th row: <=50K

Common Values

Value  Count  Frequency (%)
<=50K  36029  75.7%
>50K   11590  24.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value  Count  Frequency (%)
50k    47619  100.0%

Most occurring characters

Value  Count  Frequency (%)
0      47619  21.0%
5      47619  21.0%
K      47619  21.0%
<      36029  15.9%
=      36029  15.9%
>      11590  5.1%

Most occurring categories

Value       Count  Frequency (%)
(unknown)  226505  100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
0      47619  21.0%
5      47619  21.0%
K      47619  21.0%
<      36029  15.9%
=      36029  15.9%
>      11590  5.1%

Most occurring scripts

Value       Count  Frequency (%)
(unknown)  226505  100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
0      47619  21.0%
5      47619  21.0%
K      47619  21.0%
<      36029  15.9%
=      36029  15.9%
>      11590  5.1%

Most occurring blocks

Value       Count  Frequency (%)
(unknown)  226505  100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
0      47619  21.0%
5      47619  21.0%
K      47619  21.0%
<      36029  15.9%
=      36029  15.9%
>      11590  5.1%

Interactions

Pairwise interaction plots (121 panels; figures not reproduced here).

Correlations

variable            age  capital-gain  capital-loss  educational-num  fnlwgt  gender  hours-per-week  income  marital-status  native-country  occupation    race  relationship  workclass
age               1.000         0.126         0.059            0.082  -0.075   0.131           0.167   0.322          -0.389           0.008      -0.007   0.031        -0.324      0.065
capital-gain      0.126         1.000        -0.067            0.118  -0.007   0.049           0.094   0.269          -0.076           0.012       0.015   0.014        -0.101      0.027
capital-loss      0.059        -0.067         1.000            0.077   0.000   0.065           0.059   0.198          -0.042           0.005       0.016   0.014        -0.064      0.010
educational-num   0.082         0.118         0.077            1.000  -0.018   0.075           0.162   0.361          -0.062          -0.006       0.108   0.065        -0.100      0.030
fnlwgt           -0.075        -0.007         0.000           -0.018   1.000   0.025          -0.019   0.010           0.038          -0.059      -0.001   0.070         0.014     -0.031
gender            0.131         0.049         0.065            0.075   0.025   1.000           0.244   0.218           0.459           0.030       0.381   0.115         0.647      0.153
hours-per-week    0.167         0.094         0.059            0.162  -0.019   0.244           1.000   0.269          -0.207           0.010       0.013   0.059        -0.309      0.119
income            0.322         0.269         0.198            0.361   0.010   0.218           0.269   1.000           0.455           0.066       0.317   0.101         0.460      0.179
marital-status   -0.389        -0.076        -0.042           -0.062   0.038   0.459          -0.207   0.455           1.000          -0.025       0.021   0.083         0.314     -0.064
native-country    0.008         0.012         0.005           -0.006  -0.059   0.030           0.010   0.066          -0.025           1.000      -0.006   0.267        -0.013     -0.010
occupation       -0.007         0.015         0.016            0.108  -0.001   0.381           0.013   0.317           0.021          -0.006       1.000   0.072        -0.042     -0.032
race              0.031         0.014         0.014            0.065   0.070   0.115           0.059   0.101           0.083           0.267       0.072   1.000         0.099      0.057
relationship     -0.324        -0.101        -0.064           -0.100   0.014   0.647          -0.309   0.460           0.314          -0.013      -0.042   0.099         1.000     -0.110
workclass         0.065         0.027         0.010            0.030  -0.031   0.153           0.119   0.179          -0.064          -0.010      -0.032   0.057        -0.110      1.000
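
The matrix above explains why only gender and relationship carry a high-correlation alert: their coefficient of 0.647 is the only off-diagonal entry that stands clearly apart from the rest. The sketch below filters a handful of the largest coefficients against a cut-off. The threshold of 0.5 is an assumption (the report does not state it), although any value between 0.460 and 0.647 would select the same pair.

```python
# A few coefficients copied from the correlation matrix in this report.
# The matrix is symmetric, so each pair is listed once.
correlations = {
    ("gender", "relationship"): 0.647,
    ("income", "relationship"): 0.460,
    ("gender", "marital-status"): 0.459,
    ("income", "marital-status"): 0.455,
    ("age", "marital-status"): -0.389,
    ("educational-num", "income"): 0.361,
}

THRESHOLD = 0.5  # assumed cut-off for a "High correlation" alert

# Keep pairs whose absolute coefficient reaches the threshold.
flagged = sorted(
    pair for pair, r in correlations.items() if abs(r) >= THRESHOLD
)
print(flagged)
```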

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

index  age  workclass  fnlwgt  educational-num  marital-status  occupation  relationship  race  gender  capital-gain  capital-loss  hours-per-week  native-country  income
0       25          3  226802                7               4           6             3     2       1             0             0              40              39  <=50K
1       38          3   89814                9               2           4             0     4       1             0             0              50              39  <=50K
2       28          1  336951               12               2          11             0     4       1             0             0              40              39  >50K
3       44          3  160323               10               2           6             0     2       1          7688             0              40              39  >50K
4       18          2  103497               10               4           8             3     4       0             0             0              30              39  <=50K
5       34          3  198693                6               4           7             1     4       1             0             0              30              39  <=50K
6       29          2  227026                9               4           8             4     2       1             0             0              40              39  <=50K
7       63          5  104626               15               2          10             0     4       1          3103             0              32              39  >50K
8       24          3  369667               10               4           7             4     4       0             0             0              40              39  <=50K
9       55          3  104996                4               2           2             0     4       1             0             0              10              39  <=50K

Last rows

index  age  workclass  fnlwgt  educational-num  marital-status  occupation  relationship  race  gender  capital-gain  capital-loss  hours-per-week  native-country  income
48832   32          3   34066                6               2           5             0     0       1             0             0              40              39  <=50K
48833   43          3   84661               11               2          12             0     4       1             0             0              45              39  <=50K
48834   32          3  116138               14               4          13             1     1       1             0             0              11              36  <=50K
48835   53          3  321865               14               2           3             0     4       1             0             0              40              39  >50K
48836   22          3  310152               10               4          11             1     4       1             0             0              40              39  <=50K
48837   27          3  257302               12               2          13             5     4       0             0             0              38              39  <=50K
48838   40          3  154374                9               2           6             0     4       1             0             0              40              39  >50K
48839   58          3  151910                9               6           0             4     4       0             0             0              40              39  <=50K
48840   22          3  201490                9               4           0             3     4       1             0             0              20              39  <=50K
48841   52          4  287927                9               2           3             5     4       0         15024             0              40              39  >50K

Duplicate rows

Most frequently occurring

index  age  workclass  fnlwgt  educational-num  marital-status  occupation  relationship  race  gender  capital-gain  capital-loss  hours-per-week  native-country  income  # duplicates
20      25          3  308144               13               4           2             1     4       1             0             0              40              26  <=50K              3
0       17          3  153021                8               4          12             3     4       0             0             0              20              39  <=50K              2
1       18          4  378036                8               4           4             3     4       1             0             0              10              39  <=50K              2
2       19          2  167428               10               4           8             3     4       1             0             0              40              39  <=50K              2
3       19          3   97261                9               4           4             1     4       1             0             0              40              39  <=50K              2
4       19          3  138153               10               4           0             3     4       0             0             0              10              39  <=50K              2
5       19          3  139466               10               4          12             3     4       0             0             0              25              39  <=50K              2
6       19          3  146679               10               4           3             3     2       1             0             0              30              39  <=50K              2
7       19          3  251579               10               4           7             3     4       1             0             0              14              39  <=50K              2
8       19          3  318822                9               4           0             1     4       0             0             0              40              39  <=50K              2
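
A table like the one above can be produced by counting identical rows. The sketch below shows one way to do that with the standard library on a hypothetical miniature dataset of four-column tuples. It is not the profiler's implementation, and the rows are shortened to a few columns for readability.

```python
from collections import Counter

# Hypothetical miniature dataset: each row is a tuple of column values
# (age, workclass, fnlwgt, income). Real rows would carry all 14 columns.
rows = [
    (25, 3, 308144, "<=50K"),
    (19, 3, 97261, "<=50K"),
    (25, 3, 308144, "<=50K"),
    (25, 3, 308144, "<=50K"),
    (19, 3, 97261, "<=50K"),
    (38, 3, 89814, "<=50K"),
]

counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, n) for row, n in counts.most_common() if n > 1]

# Rows that would be dropped if only the first occurrence were kept.
n_redundant = sum(n - 1 for _, n in duplicates)

for row, n in duplicates:
    print(row, n)
```

Whether the headline figure of 44 duplicate rows counts every repeated row or only the redundant copies is not stated in the report, so `n_redundant` is offered only as one possible reading.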